g02gbf

g02gbf © Numerical Algorithms Group, 2002.

Purpose

G02GBF Fits a generalized linear model with binomial errors

Synopsis

[dev,idf,b,irank,se,cov,v,ifail] = g02gbf(x,isx,y,t<,link,wt,mean,offset,v,...
tol,maxit,iprint,epslon,weight,ifail>)

Description

 
 A generalized linear model with binomial errors consists of the 
 following elements:
 
 (a)   a set of n observations, y , from a binomial distribution:
                                 i                  
                          (t)    y        t-y
                          (y)(pi) (1-(pi))   .
 
 (b)   X, a set of p independent variables for each observation, 
       x ,x ,...,x .
        1  2      p
 
 (c)   a linear model:
                                  --         
                           (eta)= > (beta) x .
                                  --      j j
 
 (d)   a link between the linear predictor, (eta), and the mean of
       the distribution, (mu)=(pi)t, the link function, 
       (eta)=g((mu)). The possible links functions are:
                                     (  (mu) )
       (i)   logistic link: (eta)=log( ------),
                                     ( t-(mu))
       
                                     -1( (mu))
       (ii)  probit link: (eta)=(Phi)  ( ----),
                                       (  t  )
       
                                            (    (   (mu)))
       (iii) complementary log-log link: log(-log(1- ----)).
                                            (    (    t  ))
 
 (e)   a measure of fit, the deviance:
       
        n                 n   {     (  y   )           (
        --        ^^^^    --  {     (   i  )           (
        >  dev(y ,(mu) )= >  2{y log( -----)+(t -y )log(
        --      i     i   --  { i   ( ^^^^ )   i  i    (
        i=1               i=1 {     ( (mu) )           (
                              {     (     i)           (
         (t -y )  )}
           i  i   )}
        ----------)}
            ^^^^  )}
        (t -(mu) ))}
          i     i )}
 
 The linear parameters are estimated by iterative weighted least-
 squares. An adjusted dependent variable, z, is formed:
 
                                       d(eta)
                      z=(eta)+(y-(mu)) ------
                                       d(mu)
 
 and a working weight, w,
 
                                            _____________
            (      d(eta))2                /      t
          w=((tau) ------)  where (tau)=  /  ------------
            (      d(mu) )              \/   (mu)(t-(mu))
 
 At each iteration an approximation to the estimate of (beta), 
 ^^^^^^                                                         
 (beta), is found by the weighted least-squares regression of z on
 X with weights w.
 
                                     1/2         1/2            
 G02GBF finds a QR decomposition of w   X, i.e.,w   X=QR where R 
 is a p by p triangular matrix and Q is an n by p column 
 orthogonal matrix.
 
                            ^^^^^^                 
 If R is of full rank, then (beta) is the solution to:
 
                           ^^^^^^  T 1/2
                          R(beta)=Q w   z
 
 If R is not of full rank a solution is obtained by means of a 
 singular value decomposition (SVD) of R.
 
                               (D 0) T
                           R=Q (0 0)P ,
                              *       
 
 where D is a k by k diagonal matrix with non-zero diagonal 
                                      1/2 
 elements, k being the rank of R and w   X.
 
 This gives the solution
 
                                 (Q  0)      
                     ^^^^^^    -1( *  ) T 1/2
                     (beta)=P D  (0 I )Q w   z
                             1               
 
 P  being the first k columns of P, i.e., P=(P P ).
  1                                           1 0 
 
 The iterations are continued until there is only a small change 
 in the deviance.
 
 The initial values for the algorithm are obtained by taking
 
                            ^^^^^
                            (eta)=g(y)
 
 The fit of the model can be assessed by examining and testing the
 deviance, in particular by comparing the difference in deviance 
 between nested models, i.e., when one model is a sub-model of the
 other. The difference in deviance between two nested models has, 
                       2                                     
 asymptotically, a (chi)  distribution with degrees of freedom 
 given by the difference in the degrees of freedom associated with
 the two deviances.
 
                           ^^^^^^                             
 The parameters estimates, (beta), are asymptotically Normally 
 distributed with variance-covariance matrix:
 
             T                         
        -1 -1                           
     C=R  R    in the full rank case, otherwise
 
          -2 T
     C=P D  P 
        1    1
 
 The residuals and influence statistics can also be examined.
 
                                ^^^^^  ^^^^^^                   
 The estimated linear predictor (eta)=X(beta), can be written as 
   1/2                                                          
 Hw   z for an n by n matrix H. The ith diagonal elements of H, h 
                                                                 i
 give a measure of the influence of the ith values of the 
 independent variables on the fitted regression model. These are 
 sometimes known as leverages.
 
                                ^^^^  -1 ^^^^^ 
 The fitted values are given by (mu)=g  ((eta)).
 
 G02GBF also computes the deviance residuals, r:
 
                                     _____________
                           ^^^^     /       ^^^^ 
                r =sign(y -(mu) )  / dev(y ,(mu) )
                 i       i     i \/       i     i
 
 An option allows the use of prior weights in the model.
 
 In many linear regression models the first term is taken as a 
 mean term or an intercept, i.e., x ,1 = 1, for i=1,2,...,n. This 
                                   i                             
 is provided as an option.
 
 Often only some of the possible independent variables are 
 included in a model, the facility to select variables to be 
 included in the model is provided.
 
 If part of the linear predictor can be represented by variables 
 with a known coefficient then this can be included in the model 
 by using an offset, o:
 
                                --         
                       (eta)=o+ > (beta) x .
                                --      j j
 
 If the model is not of full rank the solution given will be only 
 one of the possible solutions. Only certain linear combinations 
 of the parameters will have unique estimates, these are known as 
 estimable functions.
 
 Details of the SVD, are made available, in the form of the matrix
  *
 P :
 
                               ( -1 T)
                               (D  P )
                               (    1)
                             * (  T  )
                            P =( P   ).
                               (  0  )

Parameters

g02gbf

Required Input Arguments:

x (:,:)                               real
isx (:)                               integer
y (:)                                 real
t (:)                                 real

Optional Input Arguments:                       <Default>

link (1)                              string   'g'
wt (:)                                real     zeros(length(y),1)
mean (1)                              string   'm'
offset (1)                            string   'n'
v (:,:)                               real     zeros(length(y),ip+7)
tol                                   real     10*eps
maxit                                 integer  10
iprint                                integer  -1
epslon                                real     eps
weight (1)                            string   g02gbf04(wt)
ifail                                 integer  -1

Output Arguments:

dev                                   real
idf                                   integer
b (:)                                 real
irank                                 integer
se (:)                                real
cov (:)                               real
v (:,:)                               real
ifail                                 integer